Establishment and verification of a nomogram model for predicting the risk of post-stroke depression

Objective The purpose of this study was to establish a nomogram model for predicting post-stroke depression (PSD) from clinical risk factors. Patients and Methods We used the data of 202 stroke patients collected at Xuanwu Hospital from October 2018 to September 2020 as training data to develop the predictive model. Nineteen clinical factors were evaluated as candidate risk factors. Least absolute shrinkage and selection operator (LASSO) regression was used to select the most informative patient attributes, yielding seven candidate predictors; multivariable logistic regression analysis was then carried out to determine six predictive factors and to establish a nomogram prediction model. The C-index, calibration plot, and decision curve analysis were used to evaluate the predictive ability, accuracy, and clinical practicability of the prediction model. We then used the data of 156 stroke patients collected at Xiangya Hospital from June 2019 to September 2020 for external verification. Results The selected predictors included work style, number of children, time from onset to hospitalization, history of hyperlipidemia, stroke area, and the National Institutes of Health Stroke Scale (NIHSS) score. The model showed good predictive ability, with a C-index of 0.773 (95% confidence interval: [0.696–0.850]). The C-index was 0.71 in bootstrap verification and 0.702 (95% confidence interval: [0.616–0.788]) in external verification. Decision curve analysis further showed that the PSD nomogram had high clinical usefulness when the threshold probability was 6%.
Conclusion This novel nomogram, which combines patients’ work style, number of children, time from onset to hospitalization, history of hyperlipidemia, stroke area, and NIHSS score, can help clinicians to assess the risk of depression in patients with acute stroke much earlier in the timeline of the disease, and to implement early intervention treatment so as to reduce the incidence of PSD.


INTRODUCTION
Currently, stroke is the second leading cause of death in the world (Gaete & Bogousslavsky, 2008), and it is also a significant cause of long-term disability in the middle-aged and elderly. There are more than 2 million new cases of stroke in China every year, and stroke has the highest rates of disability and loss of life among all diseases in that country (Yang et al., 2017). The incidence of stroke is expected to increase further because of the aging population, the persistently high prevalence of risk factors (such as hypertension, hyperlipidemia, and diabetes), and their poor management. Although access to overall health services has improved, the availability of specialist stroke care varies across the country, especially in rural and remote mountain areas, which is one of the reasons for the poor prognosis of stroke patients (Wu et al., 2019). In addition, in some regions the inadequate use of interventions with a low risk of adverse reactions (such as antiplatelet and lipid-lowering drugs), of stroke unit care, and of other effective interventions (for example, intravenous thrombolysis, anticoagulation, and decompression in patients with indications) also contributes to regional differences in stroke outcomes.
Post-stroke depression (PSD) is the most common mental disorder after stroke and has a negative impact on the functional recovery, rehabilitation response and quality of life of survivors. About one-third or more of stroke patients are affected by depression (Sivolap & Damulin, 2019), which makes it a serious social and public health problem whose prevention and treatment are worth studying. PSD is strongly associated with further worsening of physical and cognitive recovery, functional outcome and quality of life. Moreover, depression negatively affects patients' ability to engage in rehabilitation therapies. A two-way association between depression and stroke has also been established: stroke increases the risk of depression, and depression is an independent risk factor for stroke (Villa, Ferrari & Moretti, 2018). The reported prevalence of PSD varies from study to study, depending on demographic characteristics, diagnostic criteria, inclusion/exclusion criteria, time after stroke and the clinical setting in which patients are examined; a meta-analysis of 43 studies noted that most of the variation stems from the lack of a standard diagnostic criterion and of a unified time point for diagnosing PSD. About 39-52% of stroke patients were diagnosed with depression during 5-year follow-up, while the prevalence at any time in 5-10 years was about 29%. Interestingly, a considerable number of patients with early depression after the acute event had recovered at subsequent assessment (Lenzi, Altieri & Maestrini, 2008).
Because the pathogenesis of PSD is complex and is thought to involve social psychological, pathophysiological and other factors, there is no systematic and reliable clinical treatment for PSD, and the therapeutic effect in PSD patients is usually poor (Starkstein, Mizrahi & Power, 2008). PSD is an important factor leading to poor recovery of neurological function after stroke: it not only greatly impairs recovery from cerebral neurological deficits but may also aggravate post-stroke symptoms and greatly affect patients' ability to carry out daily life and work. At the same time, the mortality rate of PSD patients is significantly higher than that of stroke patients without PSD. According to related studies, the mortality rate of stroke patients with depression within 10 years is more than 10 times higher than that of ordinary patients (Llorca et al., 2015). Because PSD is often underdiagnosed clinically and its onset is often one month or more after stroke, it also receives too little attention in most medical settings. Compared with stroke without depression, PSD often shows higher mortality, worse neurological recovery, more obvious cognitive impairment and lower quality of life (Starkstein, Mizrahi & Power, 2008). Therefore, it is extremely important that clinicians be familiar with the risk factors of PSD so that early prevention and treatment can be carried out for such patients (Arseniou, Arvaniti & Samakouri, 2011). However, PSD is affected by a variety of risk factors: general patient characteristics (age, sex, education, occupation, income, smoking, drinking, etc.); factors related to social support (marital status, time from onset to hospitalization, lifestyle, number of children, etc.); and disease-related factors (stroke site, NIHSS score, hypertension, diabetes, history of hyperlipidemia, etc.) (Rabi-Žikić et al., 2020; Schöttke et al., 2020).
Given so many related risk factors, tools that accurately predict PSD, together with early intervention by clinicians, are an effective means of improving the prognosis of PSD patients (Vogel, 1995). Although many previous studies have examined the relationship between individual risk factors and the occurrence of PSD (De Man-van Ginkel et al., 2013), no study has combined these factors in a nomogram to predict the risk of PSD.
The purpose of our study is to establish a nomogram for risk prediction of post-stroke depression, allowing clinicians to conduct an early PSD risk assessment through clinical factors that are easily available in the early stage of the disease, and then carry out early clinical preventive treatment to effectively reduce the incidence of depression in stroke patients.

Patients
The study was approved by the Medical Ethics Committee of Xiangya Hospital of Central South University (approval number: 201910842). A total of 156 patients with acute stroke (including ischemic stroke and hemorrhagic stroke) were enrolled at Xiangya Hospital of Central South University from June 2019 to September 2020. Data from 202 patients were collected at Xuanwu Hospital of Capital Medical University from October 2018 to September 2020. Each subject was scored with the Hamilton Depression Scale at 1 month and 3 months after the onset of acute stroke. According to the score at the third month, the stroke patients were divided into a PSD group and a non-PSD group.
Inclusion criteria: (1) imaging examination after admission confirmed a first-ever stroke that conformed to the diagnosis of stroke (Tay, Morris & Markus, 2021); (2) the diagnosis of depression was made according to the DSM-5 (Medeiros et al., 2020); (3) age between 18 and 80 years; (4) time from stroke onset to hospitalization of no more than 14 days. Exclusion criteria: (1) age greater than 80 years; (2) a history of mental illness; (3) severe language disorder and disturbance of consciousness; and (4) diagnosis of another major disease (such as cancer) during follow-up.
The assessment criteria for patient-related risk factors and the clinical characteristics of the study subjects with and without PSD are summarized in Table 1.

Screen predictors
We first screened the candidate factors using LASSO regression in R, then analyzed the screened factors by multivariable logistic regression, and finally obtained six predictive factors with modeling potential. These factors were independent influencing factors of PSD (P < 0.05).

Establishment of nomogram prediction model
According to the final results of the multivariable analysis, the odds ratio (OR) of each risk factor for PSD was calculated together with its 95% confidence interval, and the nomogram model for PSD risk prediction was established in R.
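As a minimal sketch of the arithmetic behind this step, the code below converts a logistic regression coefficient into an odds ratio with a Wald-type 95% confidence interval and rescales each predictor's maximum contribution onto a 0 to 100 points axis, which is the usual way a nomogram axis is laid out. The function names, coefficients, standard errors and predictor ranges are hypothetical and chosen only for illustration; they are not the fitted values of this study, and the study itself used R rather than Python.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald-type confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def nomogram_points(betas, spans, max_points=100.0):
    """Scale each predictor's largest possible contribution to the linear
    predictor onto a 0..max_points axis (largest effect gets max_points)."""
    effects = {name: abs(betas[name]) * spans[name] for name in betas}
    largest = max(effects.values())
    return {name: max_points * effect / largest for name, effect in effects.items()}


# Hypothetical coefficients and predictor ranges, for illustration only.
example_betas = {"nihss_score": 0.12, "hyperlipidemia": 0.90}
example_spans = {"nihss_score": 20.0, "hyperlipidemia": 1.0}

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta=0.90, se=0.40)
points = nomogram_points(example_betas, example_spans)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
print(points)
```

In this illustration the predictor with the widest effect range anchors the 100-point end of the scale, and the other predictors receive proportionally fewer points.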

Validate PSD risk prediction model
We used bootstrap resampling for internal verification and the data of 156 patients from Xiangya Hospital of Central South University for external verification.
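To make the verification metric concrete, the sketch below computes the C-index for a binary outcome (the proportion of event/non-event pairs in which the patient with the event has the higher predicted risk, with ties counted as one half) and summarizes its variability over bootstrap resamples. It is a simplified illustration under stated assumptions: the predicted risks and outcomes are made-up values, the helper names are hypothetical, and the model is not refitted in each resample as a full optimism-corrected bootstrap validation would require.

```python
import random


def c_index(risks, outcomes):
    """Concordance index for a binary outcome (equivalent to the AUC)."""
    concordant = 0.0
    pairs = 0
    for r_pos, y_pos in zip(risks, outcomes):
        if y_pos != 1:
            continue
        for r_neg, y_neg in zip(risks, outcomes):
            if y_neg != 0:
                continue
            pairs += 1
            if r_pos > r_neg:
                concordant += 1.0
            elif r_pos == r_neg:
                concordant += 0.5  # ties count as half-concordant
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


def bootstrap_c_index(risks, outcomes, n_boot=200, seed=0):
    """Approximate 2.5th and 97.5th percentiles of the C-index over
    bootstrap resamples of patients (predictions are held fixed)."""
    n = len(risks)
    if not 0 < sum(outcomes) < n:
        raise ValueError("need at least one event and one non-event")
    rng = random.Random(seed)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            values.append(c_index([risks[i] for i in idx], ys))
    values.sort()
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


# Made-up predicted risks and PSD outcomes (1 = PSD), for illustration only.
example_risks = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
example_outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

apparent = c_index(example_risks, example_outcomes)
low, high = bootstrap_c_index(example_risks, example_outcomes)
print(apparent, low, high)
```

The same `c_index` function can be applied unchanged to an external cohort by passing the risks predicted for those patients and their observed outcomes.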

Drawing decision curve
Finally, decision curve analysis was used to evaluate the clinical value of the prediction model. By quantifying the net benefit at different threshold probabilities in the cohort, the decision curve was analyzed to determine the clinical practicability of the nomogram established above.

RESULTS
We used LASSO regression to select, from the collected patient data, the predictive factors associated with PSD. Because the number of candidate predictors was large and the sample size of this study was small, LASSO regression was chosen to screen for the most informative predictors. LASSO regression, first proposed in 1996, is a shrinkage estimation method. By constructing a penalty function, it produces a more refined model in which some coefficients are shrunk and others are set exactly to zero. It therefore retains the advantage of subset shrinkage and is a biased estimator suitable for collinear data. The advantage of LASSO regression is that it filters variables and controls model complexity while fitting a generalized linear model. Therefore, whether the predictive factors are continuous, binary or multi-category discrete variables, LASSO regression can be used for modeling and prediction. Variable filtering here means that variables are entered into the model selectively in order to obtain better performance; in this study, we first carried out univariate regression analysis, selected seven predictive factors, and then entered these into the model. Fig. 1 shows the binomial deviance; the lowest point of the curve corresponds to the optimal value of the parameter lambda. Fig. 2 shows the LASSO coefficient profiles of all 19 risk factors: each curve corresponds to one risk factor, the ordinate is the regression coefficient of the predictor, and the abscissa is log(lambda).
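For reference, the penalized objective minimized by LASSO logistic regression can be written as below. This is the standard textbook formulation, given here as an assumption about the fitting procedure because the software package and its settings are not specified in the text; $y_i \in \{0, 1\}$ denotes PSD status, $x_i$ the vector of the $p = 19$ candidate factors for patient $i$, $n = 202$ the training sample size, and $\lambda \ge 0$ the penalty parameter.

```latex
\hat{\beta}(\lambda) = \arg\min_{\beta_0,\,\beta}
\left\{ -\frac{1}{n}\sum_{i=1}^{n}
\Bigl[ y_i\,(\beta_0 + x_i^{\top}\beta)
 - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Larger values of $\lambda$ drive more coefficients to exactly zero, which is how the method performs variable selection while fitting the model.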
We used multivariable logistic regression analysis to calculate the odds ratio (OR) of each risk factor for PSD, its 95% confidence interval (95% CI) and its P value (significance threshold P < 0.05). As shown in Table 2, the P value of drinking history did not reach significance (P > 0.05), so we excluded it. Using the remaining predictive factors, we established the nomogram for PSD risk prediction shown in Fig. 3. Internal verification by bootstrap resampling gave a C-index of 0.71.
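The exclusion rule described here can be sketched as follows, assuming that the P values are two-sided Wald tests of each coefficient (a common default in logistic regression output, but an assumption, since the test used is not stated). The predictor names, coefficients and standard errors are hypothetical and are not the values in Table 2.

```python
import math


def wald_p_value(beta, se):
    """Two-sided P value for H0: beta = 0 using the normal approximation."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def keep_significant(estimates, alpha=0.05):
    """Return the names of predictors whose Wald P value is below alpha."""
    return [name for name, (beta, se) in estimates.items()
            if wald_p_value(beta, se) < alpha]


# Hypothetical (coefficient, standard error) pairs, for illustration only.
example = {"predictor_a": (0.90, 0.40), "predictor_b": (0.30, 0.35)}
print(keep_significant(example))
```

With these made-up numbers, only the first predictor would be retained for the nomogram.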

The calibration curves for internal and external verification showed that the PSD risk probability predicted by the nomogram model agreed well with the actual PSD risk probability. The results are shown in Figs. 4 and 5.
We further observed that the C-index was 0.773 (95% CI [0.696-0.850]) in the training cohort and 0.702 (95% CI [0.616-0.788]) in external verification, indicating that the model had a good ability to predict the risk of PSD.
Decision curve analysis (DCA) determines the clinical practicability of the nomogram by quantifying the net benefit at different threshold probabilities in the cohort. In DCA, the abscissa represents the threshold probability, which is compared with the probability of the outcome event predicted for a patient by the nomogram model. When the predicted probability reaches the specified threshold, clinical intervention is taken for the stroke patient. Some patients then benefit from the intervention, but there are also patients who need treatment and receive no intervention, and patients who are treated unnecessarily. The ordinate represents the net benefit to patients after the harm of treatment is subtracted from its benefit. Assuming that all patients are negative and none receives treatment, the net benefit is 0, shown as a line parallel to the X axis; assuming that all patients are positive and all receive treatment, the net benefit is shown as a downward-sloping line that intersects the X axis. The curve of the prediction model is read against these two reference lines: the farther it lies from them, the more net benefit the model provides over a wider range of thresholds. We observed that when the threshold probability was greater than 6% (Fig. 6), the nomogram was highly practical in clinical practice.
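The quantity plotted on the ordinate can be sketched with the standard net benefit formula, in which true positives per patient are credited and false positives per patient are penalized by the odds of the threshold probability. The code below is a minimal illustration under that assumption; the function names and the predicted risks and outcomes are made up and are not the study data.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(outcomes, threshold):
    """Reference line: every patient is treated regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Made-up predicted risks and PSD outcomes (1 = PSD), for illustration only.
example_risks = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
example_outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

model_nb = net_benefit(example_risks, example_outcomes, threshold=0.5)
treat_all_nb = net_benefit_treat_all(example_outcomes, threshold=0.5)
treat_none_nb = 0.0  # reference line: no patient is treated
print(model_nb, treat_all_nb, treat_none_nb)
```

Evaluating these functions over a grid of thresholds and plotting the three resulting curves gives a decision curve of the kind described above.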

DISCUSSION
Nomograms are now widely used for prognosis in many areas of medicine, such as oncology; their main advantages are high accuracy and easily comprehensible results, which help clinicians to make better clinical decisions (Iasonos et al., 2008; Huang et al., 2016). In recent years, the incidence of stroke has increased sharply, seriously endangering the lives of the middle-aged and elderly (Kao, Chen & Manjunath, 2020), and the occurrence of post-stroke depression can have a strong negative effect on the prognosis of stroke patients (Cai et al., 2019). However, there is almost no direct and effective treatment for post-stroke depression (Li & Zhang, 2020), because it results from multiple factors and requires comprehensive, long-term treatment. Early clinical interventions (Wang et al., 2019), such as placebo or antidepressant treatment, can nevertheless greatly reduce the incidence of PSD and improve the prognosis of patients (Huff, Ruhrmann & Sitzer, 2001). Post-stroke depression was recognized by psychiatrists as early as 100 years ago, but systematic case-control studies did not begin until the 1970s (Robinson & Jorge, 2016), partly because of the difficulty of collecting clinical cases of post-stroke depression and the limitations of late follow-up. Our study is the first to combine the clinical symptoms of patients with living environmental factors, using a nomogram to assess the risk of depression in patients with acute stroke. We selected six predictive clinical factors (work style, number of children, time from onset to hospitalization, history of hyperlipidemia, stroke area, and NIHSS score) to develop an easy-to-use nomogram as a new tool for evaluating and predicting the risk of depression after stroke.
Our results indicate that the degree of work fatigue not only affects the severity of stroke and increases the physical burden on patients but also has a great psychological impact (Volz et al., 2016): the more tiring the patient's work, the greater the psychological burden and the higher the incidence of PSD. Similarly, PSD is related to the number of children and to the time from onset to hospitalization. Type of work, number of children, and time from onset to hospitalization can be grouped together as social support factors. Specifically, when the patient does mainly manual work (for example, as a farmer or laborer) and has fewer children and less family support, psychological changes are more easily induced. A longer time from onset to hospitalization means that the patient receives less social support, making emotional distress and depression more likely (Yusrini Susanti, Wardani & Fitriani, 2019).
PSD is associated with the brain regions affected by stroke. We found that patients with stroke in the anterior circulation have a higher risk of depression. This may be because the frontal and temporal lobes in the anterior circulation territory are significantly related to the occurrence of post-stroke depression (Price & Duman, 2020). The frontal lobe is generally considered to play an important role in cognitive and emotional functioning, and through the frontal-occipital tract pathway it may be involved in the occurrence and development of depression (Howard et al., 2019; Nelson et al., 2018). It relies mainly on the ventromedial prefrontal cortex to process emotional information. From the point of view of molecular neuropathology, the prefrontal mRNA expression levels of SULT2A, 11β-hydroxysteroid dehydrogenase and other factors closely related to emotion regulation are significantly increased in patients with severe depression (Yan et al., 2020). The temporal lobe also plays an important role in the regulation of negative emotion: many clinical imaging studies show that temporal activation during tasks of negative emotion self-regulation is significantly higher in patients with depression than in healthy subjects (Maggioni et al., 2019).
PSD was found to be correlated with a history of hyperlipidemia and with the NIHSS score. These two risk factors can be summarized as clinical features of the disease. A history of hyperlipidemia and a higher NIHSS score usually mean that the disease is more severe and the prognosis poorer (Ilut et al., 2017), leading to a greater psychological burden and more negative emotions and thereby increasing the risk of PSD.
This study combines clinical data from two centers. Because the number of influencing factors was large and the sample size was small, LASSO regression was chosen to screen for the most predictive factors. The advantage of LASSO regression is that it filters variables and controls model complexity while fitting a generalized linear model. Therefore, whether the predictive factors are continuous, binary or multi-category discrete variables, we can use LASSO regression for modeling and then use a nomogram to build a model that predicts the risk of post-stroke depression. The calibration curve was also used to evaluate the model, which was found to have good accuracy and discrimination. However, because the sample size and the quantitative criteria for the influencing factors differ from those in previous studies, some of the results may not be supported by them. Increasing the sample size and including samples from more centers will therefore make the study more persuasive, and this work is under way.
In sum, an accurate risk assessment can help doctors understand the prognosis of patients early and take timely intervention measures (Almalki et al., 2018). Our PSD risk prediction model based on patients' clinical factors has a high clinical predictive value; it helps clinicians to carry out early prevention and treatment of PSD, thereby reducing the incidence of PSD and greatly improving the prognosis of stroke patients (Gu et al., 2020; Ramasubbu, 2011).

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
This work was financially supported by the National Key Research & Development Program of China (grant number 2017YFC1310000). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: The National Key Research & Development Program of China: 2017YFC1310000.